Self-training (machine learning)
A variant of self-supervised learning that is particularly useful when all of the following conditions are true:
- The ratio of unlabeled examples to labeled examples in the dataset is high.
- The task is a classification problem.
Self-training works by iterating over the following two steps until the model stops improving:
1. Use supervised learning to train a model on the labeled examples.
2. Use the model created in Step 1 to generate predictions (labels) on the unlabeled examples, then move the examples predicted with high confidence into the labeled set, each with its predicted label.
Notice that each iteration of Step 2 adds more labeled examples for Step 1 to train on.
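The two-step loop can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the "model" is a toy nearest-centroid classifier on one-dimensional, two-class data, the confidence score is a simple distance ratio, and the names fit_centroids, predict_with_confidence, self_train, threshold, and max_rounds are all hypothetical. The loop stops when no unlabeled example clears the confidence threshold, which stands in for "until the model stops improving".

```python
from statistics import mean


def fit_centroids(labeled):
    """Step 1: train a toy supervised model, one centroid (mean) per class."""
    by_class = {}
    for x, y in labeled:
        by_class.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_class.items()}


def predict_with_confidence(centroids, x):
    """Return (label, confidence); for two classes, confidence lies in [0.5, 1.0]."""
    dists = {y: abs(x - c) for y, c in centroids.items()}
    label = min(dists, key=dists.get)
    total = sum(dists.values())
    # Assumed confidence measure: the closer x is to the winning centroid
    # relative to the other one, the higher the score.
    confidence = 1.0 if total == 0 else 1.0 - dists[label] / total
    return label, confidence


def self_train(labeled, unlabeled, threshold=0.75, max_rounds=10):
    """Alternate Step 1 and Step 2 until no confident predictions remain."""
    labeled = list(labeled)
    unlabeled = list(unlabeled)
    for _ in range(max_rounds):
        centroids = fit_centroids(labeled)  # Step 1
        confident, remaining = [], []
        for x in unlabeled:  # Step 2
            label, conf = predict_with_confidence(centroids, x)
            if conf >= threshold:
                confident.append((x, label))  # pseudo-labeled example
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new to learn from, so stop iterating
        labeled.extend(confident)
        unlabeled = remaining
    return fit_centroids(labeled), labeled, unlabeled


# Two labeled examples, six unlabeled ones (a high unlabeled-to-labeled ratio).
model, labeled_out, still_unlabeled = self_train(
    labeled=[(0.0, "a"), (10.0, "b")],
    unlabeled=[1.0, 2.0, 4.0, 6.0, 9.0, 8.0],
)
print(model, still_unlabeled)
```

In this run, the first round pseudo-labels 1.0 and 2.0 as "a" and 9.0 and 8.0 as "b"; the ambiguous points 4.0 and 6.0 never reach the threshold and stay unlabeled. A lower threshold would absorb them sooner, at the risk of training on wrong labels.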